1 (Amended). A processor including [a memory, a plurality of execution units coupled to the memory and] an array prefetch apparatus for transferring array data from [the] a memory to [the plurality of] an execution [units] unit in the processor,

the array prefetch apparatus comprising:

an array prefetch queue coupled to the memory for receiving array data;

a first array prefetch queue pointer coupled to the array prefetch queue for designating in the array prefetch queue a location for loading the array data;

a second array prefetch queue pointer coupled to the array prefetch queue for designating in the array prefetch queue a location for accessing the array data; and

an array prefetch controller coupled to the array prefetch queue and the first and second array prefetch queue pointers, the array prefetch controller for executing a load operation as an array load operation and an array move operation, the array load operation for accessing the array data from the memory and transferring the array data to the array prefetch queue at the location designated by the first pointer, the array move operation for moving the array data from the array prefetch queue at the location designated by the second pointer for accessing by the execution [units] unit of the processor; and

a loop control logic supporting software pipelining of loops, the loop control logic for executing a plurality of stages (S) in a compiled, pipelined loop schedule of T cycles having the iteration interval I, in which the loop control dynamically controls the number of stages in an iteration as a function of the latencies of memory read operations.

Cancel claim 5.

18

(Amended). A processor [according to Claim 1, further comprising:] including an array prefetch apparatus for transferring array data from a memory to an execution unit in the processor,

the array prefetch apparatus comprising:

an array prefetch queue coupled to the memory for receiving array data;

a first array prefetch queue pointer coupled to the array prefetch queue for designating in the array prefetch queue a location for loading the array data;

a second array prefetch queue pointer coupled to the array prefetch queue for designating in the array prefetch queue a location for accessing the array data; and

an array prefetch controller coupled to the array prefetch queue and the first and second array prefetch queue pointers, the array prefetch controller for executing a load operation as an array load operation and an array move operation, the array load operation for accessing the array data from the memory and transferring the array data to the array prefetch queue at the location designated by the first pointer, the array move operation for moving the array data from the array prefetch queue at the location designated by the second pointer for accessing by the execution units of the processor; and

a loop control logic supporting software pipelining of loops in a horizontal processor, the loop control logic including:

a loop mode flag indicative of a current loop mode status, the loop mode flag being set when a loop is executed;

a loop counter indicative of a first remaining number of logical iterations in the loop being executed;

a prologue counter indicative of a second remaining number of logical iterations in a prologue portion of the loop being executed; and

first enabling/disabling logic coupled to the loop mode flag and to the prologue counter, the first enabling/disabling logic disabling execution of operations in a first class of operations having side effects.

Cancel claims 12 and 13.

(Amended). A method of transferring array data from a memory to a register during run-time of a compiled and compacted loop program comprising the steps of:

designating in an array prefetch queue a location for loading array data;

designating in the array prefetch queue a location for accessing the array data and moving the array data to a register;

executing a load operation as a combination of an array load operation and an array move operation;

for the array load operation, accessing the array data from the memory and transferring the array data to the array prefetch queue at the location for loading array data;

for the array move operation, moving the array data from the array prefetch queue at the location designated by the second pointer to a register designated by the array move operation.

Add the following claims:

(New). A processor including an array prefetch apparatus for transferring array data from a memory to a register, the array prefetch apparatus comprising:

an array prefetch queue coupled to the memory for receiving the array data;

an array prefetch queue tail pointer coupled to the array prefetch queue for designating in the array prefetch queue a location for loading the array data;

an array prefetch queue head pointer coupled to the array prefetch queue for designating in the array prefetch queue a location for accessing the array data and moving the array data to a register;

an array prefetch flag;

an array prefetch controller coupled to the array prefetch queue, the array prefetch flag and the first and second array prefetch queue pointers, the array prefetch controller for executing a load operation as a load operation for a first setting of the array prefetch flag and alternatively, for a second setting of the array prefetch flag, executing a load operation as a combination of an array load operation and an array move operation, the array load operation for accessing the array data from the memory and transferring the array data to the array prefetch queue at the location designated by the array prefetch queue tail pointer, the array move operation for moving the array data from the array prefetch queue at the location designated by the array prefetch head pointer to a register designated by the array move operation; and

a loop control logic supporting software pipelining of loops, the loop control logic for executing a plurality of stages (S) in a compiled, pipelined loop schedule of T cycles having an iteration interval I, in which the loop control logic dynamically controls the number of stages in an iteration as a function of the latencies of memory read operations.

22 (New). A processor including an array prefetch apparatus for transferring array data from a memory to a register, the array prefetch apparatus comprising:

an array prefetch queue coupled to the memory for receiving the array data;

an array prefetch queue tail pointer coupled to the array prefetch queue for designating in the array prefetch queue a location for loading the array data;

an array prefetch queue head pointer coupled to the array prefetch queue for designating in the array prefetch queue a location for accessing the array data and moving the array data to a register;

an array prefetch flag;

an array prefetch controller coupled to the array prefetch queue, the array prefetch flag and the first and second array prefetch queue pointers, the array prefetch controller for executing a load operation as a load operation for a first setting of the array prefetch flag and alternatively, for a second setting of the array prefetch flag, executing a load operation as a combination of an array load operation and an array move operation, the array load operation for accessing the array data from the memory and transferring the array data to the array prefetch queue at the location designated by the array prefetch queue tail pointer, the array move operation for moving the array data from the array prefetch queue at the location designated by the array prefetch head pointer to a register designated by the array move operation; and

a loop control logic supporting software pipelining of loops in a horizontal processor, the loop control logic including:

a loop mode flag indicative of a current loop mode status, the loop mode flag being set when a loop is executed;

a loop counter indicative of a first remaining number of logical iterations in the loop being executed;

a prologue counter indicative of a second remaining number of physical iterations in a prologue portion of the loop being executed; and

first enabling/disabling logic coupled to the loop mode flag and to the prologue counter, the first enabling/disabling logic disabling execution of operations in a first class of operations having side effects.

(New). A method according to claim 18 wherein array data is transferred from a memory to a register during run-time of a compiled and compacted loop operation.

## **REMARKS**

In a first office action dated February 10, 1998, claims 1-4, 7-11 and 14-20 were rejected under 35 U.S.C. § 102(b) as anticipated by Kinoshita et al. USP 5,201,058. Dependent claims 5, 6, 12 and 13 were indicated to be allowable over the art which did not teach the "loop" limitations set forth in these claims.
